Separation Theorem in Linear Stochastic
ABSTRACT

For linear stochastic systems with time delay, the optimal control is
derived that minimizes the ensemble average of a quadratic (in states and
control) performance measure. The optimal control obtained is functionally
dependent upon the expected values of the state variables conditioned on
the measurements. It is shown that the optimal control and estimation can
be performed independently; i.e., the separation theorem holds for the
class of problems considered. The optimal control is linearly dependent
upon the best estimates, which minimize the expected value of the estimation
error squared.


This work was supported in part by National Science Foundation Grant
No. GK-1970 and in part by Air Force Office of Scientific Research Grant
No. 69-1776.


Introduction 


The  well-known  separation  theorem  states  that  the  combined  problem  of 
optimal  control  and  estimation  can,  under  certain  conditions,  be  treated 
as  two  independent  problems.  It  holds,  for  example,  when  the  state  transition 
in  the  plant  and  the  observation  equations  are  linear  in  the  state  variables 
containing  additive  white  Gaussian  noises  and  when  the  performance  criterion 
is quadratic in state and control [1, 4, 5, 6]. A more rigorous treatment of
these conditions is presented in [5] for continuous-time stochastic systems.
The separation theorem for discrete-time stochastic systems is given in [1].
The purpose here is to present the separation theorem for linear stochastic
systems  described  by  linear  differential-difference  equations  of  retarded 
type  when  the  performance  measure  to  be  minimized  is  a  quadratic  function  of 
the  state  variables  and  control. 

A differential-difference equation is an equation which contains an
unknown function and its derivatives evaluated at values of the argument
that differ by some specified amount. Such mathematical models
appear commonly in aerospace applications as well as in industrial processes.
The state transition in these systems is a function, say, of state $x(t)$
evaluated at time $t$ and state $x(t-\tau)$ evaluated at time $t-\tau$, where $\tau$
represents the time delay.
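To make the retarded structure concrete, the following Python sketch integrates a scalar equation $\dot{x}(t) = a\,x(t) + b\,x(t-\tau)$ with a forward-Euler step and a buffer holding the initial function. The function name `simulate_delay`, the constant scalar coefficients, and the Euler scheme are assumptions chosen for illustration; they are not part of the original development.

```python
import numpy as np

def simulate_delay(a, b, tau, h, n_steps, history):
    """Forward-Euler integration of x'(t) = a*x(t) + b*x(t - tau).

    history: samples of the initial function on [t0 - tau, t0], spaced h apart
             (round(tau / h) + 1 entries; the last entry is x(t0)).
    Returns the samples x(t0), x(t0 + h), ..., x(t0 + n_steps * h).
    """
    m = int(round(tau / h))                  # delay expressed in steps
    history = np.asarray(history, dtype=float)
    if history.shape != (m + 1,):
        raise ValueError("history must contain round(tau / h) + 1 samples")
    x = np.empty(m + n_steps + 1)
    x[: m + 1] = history
    for k in range(m, m + n_steps):
        # the right-hand side needs the state one delay earlier, x[k - m]
        x[k + 1] = x[k] + h * (a * x[k] + b * x[k - m])
    return x[m:]

# x'(t) = -x(t - 1) with initial function 1: on [t0, t0 + 1] the solution is 1 - (t - t0)
x = simulate_delay(a=0.0, b=-1.0, tau=1.0, h=0.01, n_steps=100,
                   history=np.ones(101))
print(x[-1])   # close to zero
```

The example shows why the "state" of such a system is the whole segment $x(\xi)$, $t-\tau \le \xi \le t$: the update at each step reads a value stored one delay earlier.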

Necessary conditions to determine the optimal open-loop control for
systems with time delay are fairly well established [e.g., 7]. More recently
a feedback solution to the optimization of a system with linear plant and
quadratic criterion has also been attained [8, 8]. A solution to the
optimal filtering in linear systems with time delays was proposed in [3]
using the principle of orthogonal projections in Hilbert space. The
optimal control is derived here for linear stochastic systems with time
delay by means of dynamic programming. Then, the known results of [3]
on the optimal filtering problem are applied to the case which results from the
application of the optimal control in the stochastic system with time delay.

Statement  of  the  Problem 

The state transition of a plant is governed by a stochastic differential-difference equation

$$dx(t) = A_1(t)x(t)\,dt + A_2(t)x(t-\tau)\,dt + B(t)u(t)\,dt + D(t)\,dw(t) \tag{1}$$

where $t \in [t_0, T]$, $T < \infty$; $x(t) = \mathrm{col}(x_1(t), \ldots, x_n(t))$ represents the system
state at time $t$, and $x(t-\tau)$ at time $t-\tau$, where $\tau$ is a constant; $A_1(t)$, $A_2(t)$,
$B(t)$ and $D(t)$ are bounded matrices with elements in $C^1$ for all $t$; $w(t)$ signifies
a Brownian motion process with covariance (the prime denotes transposition)

$$E\{[w(t_2) - w(t_1)][w(t_2) - w(t_1)]'\} = Q_1\,(t_2 - t_1), \quad t_2 > t_1 \tag{2a}$$

$$E[w(t_2) - w(t_1)] = 0 \tag{2b}$$

The initial function represents a Gaussian process specified by

$$E[x(t)] = \bar{x}(t), \quad t_0 - \tau \le t \le t_0 \tag{3a}$$

$$E\{[x(t_0+\theta) - \bar{x}(t_0+\theta)][x(t_0+\sigma) - \bar{x}(t_0+\sigma)]'\} = P(t_0, \theta, \sigma) \tag{3b}$$

where $-\tau \le \theta, \sigma \le 0$.

The observations are performed according to

$$dz(t) = C_1(t)x(t)\,dt + C_2(t)x(t-\tau)\,dt + dv(t) \tag{4}$$

where $C_1(t)$ and $C_2(t)$ are bounded and continuous matrices of proper dimensions
for all $t \in [t_0, T]$, and $v(t)$, $t \in [t_0, T]$, is a Brownian motion process with the
covariance

$$E\{[v(t_2) - v(t_1)][v(t_2) - v(t_1)]'\} = Q_2\,(t_2 - t_1), \quad t_2 > t_1 \tag{5a}$$

$$E[v(t_2) - v(t_1)] = 0 \tag{5b}$$

It is also assumed that $x(t)$, $t \in [t_0 - \tau, t_0]$, and $w(t)$, $v(t)$, $t \in [t_0, T]$,
are independent random processes.


The control problem is to determine the deterministic optimal control
$u^*[\hat{x}_t, t]$ so as to minimize

$$J[u] = E_{z,t}\Big\{\int_{t_0}^{T} [x'(t)W(t)x(t) + u'(t)R(t)u(t)]\,dt\Big\} \tag{6}$$

where $W(t)$ and $R(t)$ are continuous matrices which are positive semidefinite
and positive definite, respectively, for all $t$. The operator $E_{z,t}$ signifies a
conditional expectation, i.e., $E_{z,t}\{\cdot\} = E\{\cdot \mid z(\xi),\ t_0 \le \xi \le t\}$.

Before solving the stochastic optimal control problem, some preliminary
material is first presented.

Introductory  Material 

Suppose we are given the following differential equation of retarded
type:

$$\dot{x}(t) = F_1(t)x(t) + F_2(t)x(t-\tau) + \int_{-\tau}^{0} F_3(t,s)x(t+s)\,ds \tag{7}$$

where $t \in [t_0, T]$; $F_1(t)$, $F_2(t)$ and $F_3(t,s)$ are bounded matrices with elements
in $C^1$ for all $t$. $F_3(t,s)$ signifies the kernel of the equation. The
initial function for system (7) is given by $x(t)$, $t_0 - \tau \le t \le t_0$.

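Let $\Phi(t,s)$ represent an $(n \times n)$-matrix which satisfies

$$\frac{\partial \Phi(t,s)}{\partial t} = F_1(t)\Phi(t,s) + F_2(t)\Phi(t-\tau,s) + \int_{-\tau}^{0} F_3(t,\alpha)\Phi(t+\alpha,s)\,d\alpha \tag{8}$$

where $t \in [t_0, T]$; $\Phi(t,t) = I$, $\Phi(t,s) = 0$ for $t < s$. It is noted that $\Phi(t,s)$
corresponds to the fundamental matrix of ordinary differential
equations. Also, $\Phi(s,t)$, $s > t$, satisfies the adjoint equations relative
to the second argument:

$$\frac{\partial \Phi(s,t)}{\partial t} = -\Phi(s,t)F_1(t) - \Phi(s,t+\tau)F_2(t+\tau) - \int_{-\tau}^{0} \Phi(s,t+\tau+\alpha)F_3(t+\tau+\alpha, -\alpha-\tau)\,d\alpha; \quad t_0 \le t \le T-\tau \tag{9}$$

$$\frac{\partial \Phi(s,t)}{\partial t} = -\Phi(s,t)F_1(t) - \int_{-\tau}^{0} \Phi(s,t+\tau+\alpha)F_3(t+\tau+\alpha, -\alpha-\tau)\,d\alpha; \quad T-\tau < t \le T \tag{10}$$

where the boundary conditions are provided by $\Phi(s,s) = I$ and $\Phi(s,t) = 0$
for $s < t$.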


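The solution to equation (7) can now be written

$$x(t) = \Phi(t,t_0)x(t_0) + \int_{-\tau}^{0} \Psi(t,t_0+\tau+\sigma)x(t_0+\sigma)\,d\sigma \tag{11a}$$

where

$$\Psi(t,t_0+\tau+\sigma) = \Phi(t,t_0+\tau+\sigma)F_2(t_0+\tau+\sigma) + \int_{-\tau}^{\sigma} \Phi(t,t_0+\tau+\alpha)F_3(t_0+\tau+\alpha, -\alpha-\tau+\sigma)\,d\alpha \tag{11b}$$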
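The superposition formula (11a) can be checked numerically in a simple special case. The sketch below assumes a scalar equation with constant coefficients and no distributed kernel ($F_3 \equiv 0$, so that $\Psi = \Phi F_2$), and it uses the forward-Euler discretization, for which a discrete analogue of (11a) holds exactly. The helper names `euler_delay` and `superpose` are hypothetical.

```python
import numpy as np

def euler_delay(a, b, m, h, n_steps, history):
    """Euler samples of x'(t) = a*x(t) + b*x(t - m*h); history has m + 1 entries."""
    x = np.concatenate([np.asarray(history, dtype=float), np.zeros(n_steps)])
    for k in range(m, m + n_steps):
        x[k + 1] = x[k] + h * (a * x[k] + b * x[k - m])
    return x[m:]

def superpose(a, b, m, h, n_steps, history):
    """Rebuild the same samples from the discrete fundamental solution."""
    history = np.asarray(history, dtype=float)
    unit = np.zeros(m + 1)
    unit[-1] = 1.0                       # Phi(t0, t0) = 1 and Phi = 0 before t0
    phi = euler_delay(a, b, m, h, n_steps, unit)
    x = phi * history[m]                 # term Phi(t, t0) x(t0)
    for i in range(min(m, n_steps)):
        # the history sample x(t0 + (i - m) h) enters through b*x(t - tau)
        # at step i and is then propagated by the fundamental solution
        x[i + 1:] += phi[: n_steps - i] * h * b * history[i]
    return x

rng = np.random.default_rng(1)
hist = rng.standard_normal(51)           # arbitrary initial function, tau = 50 steps
direct = euler_delay(-1.0, 0.5, 50, 0.02, 400, hist)
rebuilt = superpose(-1.0, 0.5, 50, 0.02, 400, hist)
print(np.allclose(direct, rebuilt))
```

The sum over the history samples plays the role of the integral over $\sigma \in [-\tau, 0]$ in (11a), with $h\,b\,\Phi$ standing in for $\Psi\,d\sigma$.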

The solution to the process $x(t)$ in equation (1) is defined as

$$x(t) = \Phi(t,t_0)x(t_0) + \int_{-\tau}^{0} \Psi(t,t_0+\tau+\sigma)x(t_0+\sigma)\,d\sigma + \int_{t_0}^{t} \Phi(t,\sigma)B(\sigma)u(\sigma)\,d\sigma + \int_{t_0}^{t} \Phi(t,\sigma)D(\sigma)\,dw(\sigma) \tag{12}$$

where $\Phi(t,s)$ is determined by equation (8) with $F_3(t,s) \equiv 0$ (null matrix).
One observes that the process $x(t)$ is Gaussian since $x(\xi)$, $t_0 - \tau \le \xi \le t_0$,
and $\{dw(t)\}$ are Gaussian [4].

Having presented the preliminary material, the optimization problem
can now be solved by the method of dynamic programming.


Solution  to  the  Stochastic  Control  Problem 
The optimal control for the stochastic system with time delay is
determined by the dynamic programming method. As usual, one assumes that
the system starts evolving at time $t \in [t_0, T)$ from state $x(\xi)$, $t-\tau \le \xi \le t$.
The minimum value of the functional specified in equation (6) is denoted
by $V[x_t, t]$, i.e.,

$$V[x_t, t] = \min_{u} E_{z,t}\Big\{\int_{t}^{T} \big[\|x(s)\|^2_{W(s)} + \|u(s)\|^2_{R(s)}\big]\,ds\Big\} \tag{13}$$

where $\|x(s)\|^2_{W(s)} = x'(s)W(s)x(s)$.


The first term on the right of equation (13) can be expressed as
follows:

$$E_{z,t}\Big\{\int_{t}^{T} \|x(s)\|^2_{W(s)}\,ds\Big\} = \int_{t}^{T} \|\hat{x}(s)\|^2_{W(s)}\,ds + \int_{t}^{T} \mathrm{tr}[W(s)P(s,0,0)]\,ds \tag{14}$$

where

$$\hat{x}(s) = E_{z,t}\{x(s)\} \tag{15a}$$

$$P(s,0,0) = E_{z,t}\{[x(s) - \hat{x}(s)][x(s) - \hat{x}(s)]'\} \tag{15b}$$
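The decomposition in equation (14) rests on the identity $E\{x'Wx\} = \hat{x}'W\hat{x} + \mathrm{tr}[WP]$, which holds for any distribution with mean $\hat{x}$ and covariance $P$. The Python sketch below checks it on a finite sample, where the sample mean and the (biased) sample covariance stand in for the conditional mean and the error covariance; the dimensions and random data are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_samples = 3, 2000

M = rng.standard_normal((n, n))
W = M @ M.T                                     # symmetric positive semidefinite weight
x = rng.standard_normal((n_samples, n)) @ rng.standard_normal((n, n)) + 1.0

x_hat = x.mean(axis=0)                          # stands in for the conditional mean
err = x - x_hat
P = err.T @ err / n_samples                     # stands in for P(s, 0, 0)

lhs = np.mean(np.einsum("ki,ij,kj->k", x, W, x))    # average of x' W x
rhs = x_hat @ W @ x_hat + np.trace(W @ P)           # ||x_hat||_W^2 + tr[W P]
print(np.isclose(lhs, rhs))
```

The cross term vanishes because the deviations from the mean average to zero, which is why only the trace term carries the effect of the estimation error into the cost.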


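If the states of a system evolve in time according to equation (7),
equation (11a) can be substituted for $\hat{x}(s)$ in equation (14) (naturally,
after changing $t$ to $s$ and $t_0$ to $t$ in equation (11a)). Then, we can write

$$\int_{t}^{T} \|\hat{x}(s)\|^2_{W(s)}\,ds = \hat{x}'(t)K_0(t)\hat{x}(t) + \hat{x}'(t)\int_{-\tau}^{0} K_1(t,\sigma)\hat{x}(t+\sigma)\,d\sigma + \Big[\int_{-\tau}^{0} \hat{x}'(t+\sigma)K_1'(t,\sigma)\,d\sigma\Big]\hat{x}(t) + \int_{-\tau}^{0}\!\!\int_{-\tau}^{0} \hat{x}'(t+\sigma)K_2(t,\sigma,\alpha)\hat{x}(t+\alpha)\,d\sigma\,d\alpha \tag{16}$$

where

$$K_0(t) = \int_{t}^{T} \Phi'(s,t)W(s)\Phi(s,t)\,ds, \quad t_0 \le t \le T \tag{17}$$

$$K_1(t,\sigma) = \int_{t}^{T} \Phi'(s,t)W(s)\Psi(s,t+\tau+\sigma)\,ds; \quad -\tau \le \sigma \le 0 \tag{18}$$

$$K_2'(t,\alpha,\sigma) = K_2(t,\sigma,\alpha) = \int_{t}^{T} \Psi'(s,t+\tau+\sigma)W(s)\Psi(s,t+\tau+\alpha)\,ds \tag{19}$$

One observes that equations (17), (18) and (19) at time $t = T$ become:

$$K_0(T) = 0, \quad K_1(T,\sigma) = 0, \quad K_2(T,\sigma,\alpha) = 0 \tag{20}$$

for $-\tau \le \sigma \le 0$ and $-\tau \le \alpha \le 0$.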

Suppose now that $x(t)$ evolves in time according to equation (1),
where $u(t) = u[\hat{x}_t, t]$ is specified by

$$u[\hat{x}_t, t] = -R^{-1}B'K_0(t)\hat{x}(t) - R^{-1}B'\int_{-\tau}^{0} K_1(t,\sigma)\hat{x}(t+\sigma)\,d\sigma \tag{21}$$

where the continuous matrices $K_0(t)$ and $K_1(t,\sigma)$ are to be determined. Then,
$\hat{x}(t)$ is governed by

$$\dot{\hat{x}}(t) = A_1(t)\hat{x}(t) + A_2(t)\hat{x}(t-\tau) + B(t)u[\hat{x}_t, t] \tag{22}$$

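Now it follows that the performance criterion can be expressed as (see
appendix)

$$E_{z,t}\Big\{\int_{t}^{T} \big[\|x(s)\|^2_{W(s)} + \|u(s)\|^2_{R(s)}\big]\,ds\Big\} = \int_{t}^{T} \big\{\|\hat{x}(s)\|^2_{W(s)} + \mathrm{tr}[W(s)P(s,0,0)] + \|u[\hat{x}_s, s]\|^2_{R(s)}\big\}\,ds$$

$$= \|\hat{x}(t)\|^2_{K_0(t)} + \int_{-\tau}^{0} \big[\hat{x}'(t)K_1(t,\sigma)\hat{x}(t+\sigma) + \hat{x}'(t+\sigma)K_1'(t,\sigma)\hat{x}(t)\big]\,d\sigma + \int_{-\tau}^{0}\!\!\int_{-\tau}^{0} \hat{x}'(t+\sigma)K_2(t,\sigma,\alpha)\hat{x}(t+\alpha)\,d\sigma\,d\alpha + S(t) \tag{23}$$

where $K_0(t)$, $K_1(t,\sigma)$ and $K_2(t,\sigma,\alpha)$ are continuous $(n \times n)$-matrices, and $S(t)$
is a continuous scalar function dependent upon the covariance of the plant
noise. The functional equations (given by (A6), (A7) and (A8) in the
appendix) can be shown to be equivalent to the following set of partial
differential equations: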

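$$\frac{dK_0(t)}{dt} = -A_1'(t)K_0(t) - K_0(t)A_1(t) + K_0(t)BR^{-1}B'K_0(t) - K_1(t,0) - K_1'(t,0) - W(t) \tag{24}$$

$$\frac{\partial K_1(t,\sigma)}{\partial t} - \frac{\partial K_1(t,\sigma)}{\partial \sigma} = -A_1'(t)K_1(t,\sigma) + K_0(t)BR^{-1}B'K_1(t,\sigma) - K_2(t,0,\sigma) \tag{25}$$

$$\frac{\partial K_2(t,\sigma,\alpha)}{\partial t} - \frac{\partial K_2(t,\sigma,\alpha)}{\partial \sigma} - \frac{\partial K_2(t,\sigma,\alpha)}{\partial \alpha} = K_1'(t,\sigma)BR^{-1}B'K_1(t,\alpha) \tag{26}$$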


Equations (24), (25) and (26) can be derived (after some tedious manipulations)
by substituting equation (12) into equation (14), collecting the
terms for $K_0(\cdot)$, $K_1(\cdot,\cdot)$ and $K_2(\cdot,\cdot,\cdot)$ in the resulting equation,
differentiating their expressions and making use of equations (21) and (22). A
more detailed discussion is given in the appendix. A more direct derivation
of equations (24), (25) and (26) can be based on equation (29), which is
obtained in the sequel.

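The boundary conditions associated with equations (24), (25) and (26)
are:

$$K_0(T) = 0; \quad K_1(T,\sigma) = 0; \quad K_2(T,\sigma,\alpha) = 0; \quad K_0(t)A_2(t) = K_1(t,-\tau); \quad K_2(t,-\tau,\alpha) = A_2'(t)K_1(t,\alpha); \quad K_2(t,\sigma,\alpha) = K_2'(t,\alpha,\sigma) \tag{27}$$

Equation (27) can be verified directly from the defining expressions
of $K_0(t)$, $K_1(t,\sigma)$ and $K_2(t,\sigma,\alpha)$.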
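As a sanity check on the structure of equation (24), consider the special case without delay, assumed here purely for illustration: with $A_2 \equiv 0$ the boundary conditions allow $K_1 \equiv 0$ and $K_2 \equiv 0$, and (24) reduces to the ordinary matrix Riccati equation. The Python sketch below integrates the scalar version backward from $K_0(T) = 0$ with a fourth-order Runge-Kutta step and compares the result over a long horizon with the positive root of the algebraic equation. The function name `riccati_gain` and the numerical values are assumptions.

```python
import math

def riccati_gain(a, b, w, r, horizon, n_steps=2000):
    """Integrate dK/dt = -2aK + (b^2/r)K^2 - w backward from K(T) = 0 (RK4)."""
    def rhs(k):
        # derivative with respect to the time-to-go s = T - t
        return 2.0 * a * k - (b * b / r) * k * k + w

    h = horizon / n_steps
    k = 0.0                                  # terminal condition K(T) = 0
    for _ in range(n_steps):
        k1 = rhs(k)
        k2 = rhs(k + 0.5 * h * k1)
        k3 = rhs(k + 0.5 * h * k2)
        k4 = rhs(k + h * k3)
        k += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return k                                 # value of K at time T - horizon

a, b, w, r = -1.0, 1.0, 1.0, 1.0
k_start = riccati_gain(a, b, w, r, horizon=10.0)
k_steady = r * (a + math.sqrt(a * a + b * b * w / r)) / (b * b)
print(k_start, k_steady)
```

With delay present, the same backward sweep would have to carry $K_1(t,\sigma)$ and $K_2(t,\sigma,\alpha)$ on a grid in $\sigma$ and $\alpha$ as well, coupled through the boundary conditions in (27).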

Now it can be seen that $u[\hat{x}_t, t]$ in equation (21) is the optimal control
$u^*$. Namely, writing $L(t,x,u)$ for the integrand in equation (6) and
observing that $V[x_T, T] = 0$, one obtains for $t = t_0$:

$$V[x_{t_0}, t_0] = -E_{z,t}\int_{t_0}^{T} \frac{dV[x_t,t]}{dt}\,dt = -E_{z,t}\int_{t_0}^{T} \Big\{\frac{dV[x_t,t]}{dt} + L(t,x,u) - L(t,x,u)\Big\}\,dt \le E_{z,t}\int_{t_0}^{T} L(t,x,u)\,dt \tag{28}$$

Hence, a sufficient condition for the optimality is determined by solving

$$0 = \min_{u} E_{z,t}\Big\{\frac{dV[x_t,t]}{dt} + L(t,x,u)\Big\} \tag{29}$$


where the expression for $V[\cdot,\cdot]$ is specified by equation (23). (In equation
(23), $S(t)$ depends implicitly on the plant covariance but does not depend
upon $u(\cdot)$.) Straightforward calculations show that if the control in
equation (21) is chosen, equations (28) and (29) are fulfilled. Thus, the
optimal control $u^*[\hat{x}_t, t]$ is given by equation (21). It indicates that (i)
the optimal control is generated linearly by the expected values of the state
variables conditioned on the measurements; (ii) the feedback gains are
independent of the observations and can be computed in advance ("off-line").

The estimates of the state variables needed are the optimal estimates that
minimize the conditional mean of the squared estimation error [3].

Since the optimal control given by equation (21) is deterministic,
the optimal control and the optimal filtering can be solved
independently. Thus the separation theorem for stochastic linear systems
with time delay is established.

Separation theorem: For stochastic linear systems with time delay
described by equations (1) through (5), the optimal control that minimizes
performance criterion (6) is specified by equation (21). The optimal feedback
gains and the optimal estimates $\hat{x}(t+\sigma) = E\{x(t+\sigma) \mid z(\rho),\ t_0 \le \rho \le t\}$,
$t-\tau \le t+\sigma \le t$, can be determined independently.

One observes that the feedback gains in equation (21) are deterministic,
and they do not depend upon the statistics of the noises. Moreover, the
optimal control (21) is the same as the optimal control of a deterministic
system obtained by replacing the random variables in equations (1), (4) and
(6) by their average values; i.e., the certainty-equivalence is valid for the
class of problems considered here.

An alternative formal derivation of the dynamic programming equation is
presented in Appendix B by using the principle of optimality.

Optimal  Estimation 

In order to apply the optimal control $u^*[\hat{x}_t, t]$ in equation (21), the
optimal estimates $\hat{x}(\xi) = E_{z,t}\{x(\xi)\}$ for $t-\tau \le \xi \le t$ (the conditional mean
value of $x(\xi)$) must be generated. When the system moves under the influence
of $u^*[\hat{x}_t, t]$, the plant equation is

$$dx(t) = \Big\{A_1(t)x(t) + A_2(t)x(t-\tau) - BR^{-1}B'\Big[K_0(t)\hat{x}(t) + \int_{-\tau}^{0} K_1(t,\sigma)\hat{x}(t+\sigma)\,d\sigma\Big]\Big\}\,dt + D(t)\,dw(t) \tag{30}$$

and the observations are performed according to equation (4).

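The optimal estimation equations can be written by applying the results
of [3]. The estimates of the state variables that minimize

$$E\{[x(t+\theta) - \hat{x}(t+\theta)]'[x(t+\theta) - \hat{x}(t+\theta)] \mid z(\xi),\ t_0 \le \xi \le t\}, \quad -\tau \le \theta \le 0 \tag{31}$$

are determined by equations (32) and (33) (formally):

$$\frac{d\hat{x}(t)}{dt} = A_1(t)\hat{x}(t) + A_2(t)\hat{x}(t-\tau) - BR^{-1}B'\Big[K_0(t)\hat{x}(t) + \int_{-\tau}^{0} K_1(t,\sigma)\hat{x}(t+\sigma)\,d\sigma\Big] + G^0(t,0,t)\big[\dot{z}(t) - C_1(t)\hat{x}(t) - C_2(t)\hat{x}(t-\tau)\big]; \quad \theta = 0 \tag{32}$$

$$\frac{\partial \hat{x}(t+\theta)}{\partial t} - \frac{\partial \hat{x}(t+\theta)}{\partial \theta} = G^0(t,\theta,t)\big[\dot{z}(t) - C_1(t)\hat{x}(t) - C_2(t)\hat{x}(t-\tau)\big]; \quad -\tau \le \theta < 0 \tag{33}$$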

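The gain of the estimator is specified by

$$G^0(t,\theta,t) = \big[P(t,\theta,0)C_1'(t) + P(t,\theta,-\tau)C_2'(t)\big]Q_2^{-1} \tag{34}$$

where $P(\cdot,\cdot,\cdot)$ signifies the error covariance

$$P(t,\theta,\sigma) = E_{z,t}\{[x(t+\theta) - \hat{x}(t+\theta)][x(t+\sigma) - \hat{x}(t+\sigma)]'\}, \quad -\tau \le \theta, \sigma \le 0 \tag{35}$$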

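This covariance is governed by the following set of partial differential
equations:

$$\frac{dP(t,0,0)}{dt} = A_1(t)P(t,0,0) + A_2(t)P(t,-\tau,0) + P(t,0,0)A_1'(t) + P(t,0,-\tau)A_2'(t) + D(t)Q_1D'(t) - P(t,0,0)C_1'(t)Q_2^{-1}C_1(t)P(t,0,0) - P(t,0,0)C_1'(t)Q_2^{-1}C_2(t)P(t,-\tau,0) - P(t,0,-\tau)C_2'(t)Q_2^{-1}C_1(t)P(t,0,0) - P(t,0,-\tau)C_2'(t)Q_2^{-1}C_2(t)P(t,-\tau,0) \tag{36}$$

$$\frac{\partial P(t,0,\sigma)}{\partial t} - \frac{\partial P(t,0,\sigma)}{\partial \sigma} = A_1(t)P(t,0,\sigma) + A_2(t)P(t,-\tau,\sigma) - P(t,0,0)C_1'(t)Q_2^{-1}C_1(t)P(t,0,\sigma) - P(t,0,0)C_1'(t)Q_2^{-1}C_2(t)P(t,-\tau,\sigma) - P(t,0,-\tau)C_2'(t)Q_2^{-1}C_1(t)P(t,0,\sigma) - P(t,0,-\tau)C_2'(t)Q_2^{-1}C_2(t)P(t,-\tau,\sigma) \tag{37}$$

$$\frac{\partial P(t,\theta,\sigma)}{\partial t} - \frac{\partial P(t,\theta,\sigma)}{\partial \theta} - \frac{\partial P(t,\theta,\sigma)}{\partial \sigma} = -P(t,\theta,0)C_1'(t)Q_2^{-1}C_1(t)P(t,0,\sigma) - P(t,\theta,0)C_1'(t)Q_2^{-1}C_2(t)P(t,-\tau,\sigma) - P(t,\theta,-\tau)C_2'(t)Q_2^{-1}C_1(t)P(t,0,\sigma) - P(t,\theta,-\tau)C_2'(t)Q_2^{-1}C_2(t)P(t,-\tau,\sigma) \tag{38}$$

Equations (32) through (38) establish the solution to the optimal
estimation. Equations (36) through (38) are independent of the observations,
and can be computed "off-line". The computational difficulties involved are
presently being explored.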


The equations for the solution to the combined problem of the optimal
control and estimation are now furnished. The block diagram in Figure 1
displays the optimal system. The main difficulty in the "on-line" implementation
is due to the realization of the term $-BR^{-1}B'(t)\int_{-\tau}^{0} K_1(t,\sigma)\hat{x}(t+\sigma)\,d\sigma$,
which requires the solution of the optimal smoothing as well. As a first
approximation, the integral can be replaced by a finite sum. Then the
designer can use a finite number of controllers to operate on the optimal
smoothed estimates, which can be computed (by means of a fixed-lag smoothing
procedure). The problem of implementing the optimal solution for time
delay systems is currently being investigated.
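The finite-sum approximation mentioned above can be sketched as follows. The Python function below evaluates the control law (21) with the integral replaced by a trapezoidal sum over a uniform grid on $[-\tau, 0]$. The function name `delayed_feedback`, the choice of the trapezoidal rule, and the array layout are assumptions made for illustration; the gain and estimate samples are taken as given inputs.

```python
import numpy as np

def delayed_feedback(K0, K1_samples, x_hat_samples, B, R, tau):
    """Approximate u = -R^{-1} B' [K0 x_hat(t) + int_{-tau}^0 K1(t,s) x_hat(t+s) ds].

    K1_samples:    array (N+1, n, n) with K1(t, s_j), s_j = -tau + j*tau/N
    x_hat_samples: array (N+1, n) with the estimates x_hat(t + s_j);
                   the last row is x_hat(t)
    """
    K1_samples = np.asarray(K1_samples, dtype=float)
    x_hat_samples = np.asarray(x_hat_samples, dtype=float)
    n_nodes = K1_samples.shape[0]
    if n_nodes < 2:
        raise ValueError("at least two grid points are required")
    weights = np.full(n_nodes, tau / (n_nodes - 1))   # trapezoidal weights
    weights[0] *= 0.5
    weights[-1] *= 0.5
    # K1(t, s_j) x_hat(t + s_j) at every grid point, then the weighted sum
    integrand = np.einsum("jab,jb->ja", K1_samples, x_hat_samples)
    integral = weights @ integrand
    return -np.linalg.solve(R, B.T @ (K0 @ x_hat_samples[-1] + integral))

# scalar check: K0 = 2, K1 = 1, x_hat = 1 on the whole interval, tau = 0.5
u = delayed_feedback(np.array([[2.0]]), np.ones((11, 1, 1)), np.ones((11, 1)),
                     np.array([[1.0]]), np.array([[1.0]]), tau=0.5)
print(u)
```

Each grid point corresponds to one of the "finite number of controllers" acting on a smoothed estimate $\hat{x}(t+\sigma_j)$; a finer grid trades more smoothing computations for a better approximation of the integral.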

Conclusions 

The optimal feedback control is determined for stochastic linear
systems with time-invariant time delay so that the average value of a
quadratic cost functional is minimized. The optimal control depends linearly
on the expected values of the state variables conditioned on the measurements.
They are the best estimates of the states, which result
in the minimum of the estimation error. The optimal feedback control
and the optimal estimates can be determined independently.
Appendix A

Derivation of Equations (23) through (26)

Suppose that systems (1) and (22) move under the influence of the control

$$u[\hat{x}_t, t] = -R^{-1}B'K_0(t)\hat{x}(t) - R^{-1}B'\int_{-\tau}^{0} K_1(t,\sigma)\hat{x}(t+\sigma)\,d\sigma \tag{A1}$$

where $K_0(t)$ and $K_1(t,\sigma)$ are gain matrices to be determined. The task is
now to express $V[x_t, t]$ in equation (13) in terms of $\hat{x}(\xi)$, $t-\tau \le \xi \le t$, the
initial function. One observes first that the solution $\hat{x}(s)$ at time $s > t$
to equation (22) can be written by means of equations (11a) and (11b):

$$\hat{x}(s) = \Phi(s,t)\hat{x}(t) + \int_{-\tau}^{0} \Psi(s,t+\tau+\sigma)\hat{x}(t+\sigma)\,d\sigma \tag{A2}$$

where $\Psi(\cdot,\cdot)$ is specified by

$$\Psi(s,t+\tau+\sigma) = \Phi(s,t+\tau+\sigma)A_2(t+\tau+\sigma) - \int_{-\tau-\sigma}^{0} \Phi(s,t+\tau+\sigma+\alpha)BR^{-1}B'(t+\tau+\sigma+\alpha)K_1(t+\tau+\sigma+\alpha, -\alpha-\tau)\,d\alpha \tag{A3}$$

Before substituting $\hat{x}(s)$ into equation (13), relation (14) is used
in equation (13). The resulting equation can be rewritten by means of
equation (A1):

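$$V[x_t, t] = \int_{t}^{T} \Big\{\hat{x}'(s)\big[W(s) + K_0(s)BR^{-1}B'(s)K_0(s)\big]\hat{x}(s) + \hat{x}'(s)K_0(s)BR^{-1}B'(s)\int_{-\tau}^{0} K_1(s,\sigma)\hat{x}(s+\sigma)\,d\sigma + \int_{-\tau}^{0} \hat{x}'(s+\sigma)K_1'(s,\sigma)\,d\sigma\,BR^{-1}B'(s)K_0(s)\hat{x}(s) + \int_{-\tau}^{0}\!\!\int_{-\tau}^{0} \hat{x}'(s+\sigma)K_1'(s,\sigma)BR^{-1}B'(s)K_1(s,\alpha)\hat{x}(s+\alpha)\,d\sigma\,d\alpha + \mathrm{tr}[W(s)P(s,0,0)]\Big\}\,ds \tag{A4}$$

Now equation (A4) can be expressed solely in terms of $\hat{x}(t+\xi)$, $-\tau \le \xi \le 0$,
by substituting equation (A2) into (A4). It results in

$$V[x_t, t] = \hat{x}'(t)K_0(t)\hat{x}(t) + \int_{-\tau}^{0} \big[\hat{x}'(t)K_1(t,\sigma)\hat{x}(t+\sigma) + \hat{x}'(t+\sigma)K_1'(t,\sigma)\hat{x}(t)\big]\,d\sigma + \int_{-\tau}^{0}\!\!\int_{-\tau}^{0} \hat{x}'(t+\sigma)K_2(t,\sigma,\alpha)\hat{x}(t+\alpha)\,d\alpha\,d\sigma + S(t) \tag{A5}$$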

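where $K_0(t)$, $K_1(t,\sigma)$ and $K_2(t,\sigma,\alpha)$ are specified by

$$K_0(t) = \int_{t}^{T} ds\,\Big\{\Phi'(s,t)\big[W(s) + K_0(s)BR^{-1}B'(s)K_0(s)\big]\Phi(s,t) + \Phi'(s,t)K_0(s)BR^{-1}B'(s)\int_{-\tau}^{0} K_1(s,\alpha)\Phi(s+\alpha,t)\,d\alpha + \int_{-\tau}^{0} \Phi'(s+\alpha,t)K_1'(s,\alpha)\,d\alpha\,BR^{-1}B'(s)K_0(s)\Phi(s,t) + \int_{-\tau}^{0}\!\!\int_{-\tau}^{0} \Phi'(s+\sigma,t)K_1'(s,\sigma)BR^{-1}B'(s)K_1(s,\alpha)\Phi(s+\alpha,t)\,d\alpha\,d\sigma\Big\} \tag{A6}$$

$$K_1(t,\sigma) = \int_{t}^{T} ds\,\Big\{\Phi'(s,t)\big[W(s) + K_0(s)BR^{-1}B'(s)K_0(s)\big]\Psi(s,t+\tau+\sigma) + \Phi'(s,t)K_0(s)BR^{-1}B'(s)\int_{-\tau}^{0} K_1(s,\beta)\Psi(s+\beta,t+\tau+\sigma)\,d\beta + \int_{-\tau}^{0} \Phi'(s+\xi,t)K_1'(s,\xi)\,d\xi\,BR^{-1}B'(s)K_0(s)\Psi(s,t+\tau+\sigma) + \int_{-\tau}^{0}\!\!\int_{-\tau}^{0} \Phi'(s+\xi,t)K_1'(s,\xi)BR^{-1}B'(s)K_1(s,\beta)\Psi(s+\beta,t+\tau+\sigma)\,d\xi\,d\beta\Big\} \tag{A7}$$

$$K_2(t,\sigma,\alpha) = \int_{t}^{T} ds\,\Big\{\Psi'(s,t+\tau+\sigma)\big[W(s) + K_0(s)BR^{-1}B'(s)K_0(s)\big]\Psi(s,t+\tau+\alpha) + \Psi'(s,t+\tau+\sigma)K_0(s)BR^{-1}B'(s)\int_{-\tau}^{0} K_1(s,\beta)\Psi(s+\beta,t+\tau+\alpha)\,d\beta + \int_{-\tau}^{0} \Psi'(s+\beta,t+\tau+\sigma)K_1'(s,\beta)\,d\beta\,BR^{-1}B'(s)K_0(s)\Psi(s,t+\tau+\alpha) + \int_{-\tau}^{0}\!\!\int_{-\tau}^{0} \Psi'(s+\xi,t+\tau+\sigma)K_1'(s,\xi)BR^{-1}B'(s)K_1(s,\beta)\Psi(s+\beta,t+\tau+\alpha)\,d\xi\,d\beta\Big\} \tag{A8}$$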

where $\Psi(s,t+\tau+\sigma)$ is defined by equation (A3), and $\Phi(s,t)$ satisfies equations
(9) and (10) with respect to $t$. Moreover, $S(t) = \mathrm{tr}\int_{t}^{T} W(s)P(s,0,0)\,ds$.
Equations (A6), (A7) and (A8) yield directly that

$$K_1(t,-\tau) = K_0(t)A_2(t) \tag{A9}$$

$$K_2(t,\sigma,\alpha) = K_2'(t,\alpha,\sigma); \quad -\tau \le \sigma, \alpha \le 0 \tag{A10}$$

$$K_2(t,\sigma,-\tau) = K_1'(t,\sigma)A_2(t); \quad -\tau \le \sigma \le 0 \tag{A11}$$

$$K_0(T) = 0 \ \text{(null matrix)} \tag{A12}$$

$$K_1(T,\sigma) = 0 \tag{A13}$$

$$K_2(T,\sigma,\alpha) = 0 \tag{A14}$$

Equations (A9) through (A14) establish the equations given by (27).

If equation (A6) is differentiated with respect to $t$, one can obtain
equation (24) by making use of equations (9), (10) and equations (A6) and
(A7). Moreover, if equation (A7) is used to generate $\partial K_1/\partial t - \partial K_1/\partial \sigma$, one
obtains equation (25) by means of equations (9), (10) and (A8). Similarly,
equation (A8) yields equation (26) after some tedious manipulations.

It is noted that equations (24) through (27) can also be obtained
directly by means of equation (29).


Appendix B

Formal Derivation of Equation (29)

A formal derivation for the optimality condition (29) is presented here.
To apply dynamic programming, let the minimum value of the performance
index (PI) for the process starting at time $t$ from state $x(\xi)$, $t-\tau \le \xi \le t$,
be denoted by

$$V[x_t, t] = \min_{u} E_{z,t}\Big\{\int_{t}^{T} \big[\|x(t)\|^2_{W(t)} + \|u(t)\|^2_{R(t)}\big]\,dt\Big\} = \min_{u} E_{z,t}\Big\{\int_{t}^{T} L[t,x,u]\,dt\Big\} \tag{B1}$$

where $\|\cdot\|$ signifies a Euclidean norm; $x_t$ emphasizes the functional dependence
of $V$ on $x$; and $E_{z,t}\{\cdot\} = E\{\cdot \mid z(\xi),\ t_0 \le \xi \le t\}$ represents a conditional
expectation operation.

The application of the principle of optimality leads to

$$V[x_t, t] = \min_{u} E_{z,t}\Big\{\int_{t}^{t+\Delta} L(t,x,u)\,dt + V[(x+\Delta x)_{t+\Delta}, t+\Delta]\Big\} \tag{B2}$$

where $\Delta x = \Delta x(t+\sigma)$, $-\tau \le \sigma \le 0$, is a smooth displacement about some
given $x(t)$ corresponding to a fixed control $u$; and the conditional expectation
operation is performed on $x$ under the condition that $\Delta x$ is determinate.
It now follows that for some sample functions of the $x$-process

$$V[(x+\Delta x)_{t+\Delta}, t+\Delta] = V[x_t, t] + \frac{dV}{dt}\Delta + \frac{1}{2}\frac{d^2V}{dt^2}\Delta^2 + o(\Delta^3) \tag{B3}$$

Second derivatives of $V[\cdot,\cdot]$ are assumed to be bounded;
$dV/dt$ is the total derivative of $V[\cdot,\cdot]$ with respect to $t$ evaluated
along a sample trajectory $x(t)$ for some fixed control; similarly, $d^2V/dt^2$;
$o(\Delta^3)/\Delta^2$ approaches zero as $\Delta$ approaches zero. The expectation (the ensemble
average) is to be taken over all $x$-trajectories.

Equation (B3) is now substituted into the right side of equation (B2).
Since $V[x_t, t]$ does not depend explicitly on $u(\cdot)$, it can be taken outside
the braces in the resulting equation; it then cancels the term on the left
side. One thus obtains

$$0 = \min_{u} E_{z,t}\Big\{L(t,x,u)\Delta + \frac{dV}{dt}\Delta + \frac{1}{2}\frac{d^2V}{dt^2}\Delta^2 + o(\Delta^3)\Big\} \tag{B4}$$

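In order to demonstrate the implication of equation (B4) and of the
functional dependence of $V$ upon $x$, the problem is considered in which the
plant is linear with additive noise terms and the ensemble average of the
performance criterion is quadratic in control and state variables. In such
a case, the functional expression of $V[\cdot,\cdot]$ can be written as

$$V[x_t, t] = E_{z,t}\Big\{\|x(t)\|^2_{P_0(t)} + x'(t)\int_{-\tau}^{0} P_1(t,\sigma)x(t+\sigma)\,d\sigma + \int_{-\tau}^{0} x'(t+\sigma)P_1'(t,\sigma)\,d\sigma\,x(t) + \int_{-\tau}^{0}\!\!\int_{-\tau}^{0} x'(t+\sigma)P_2(t,\sigma,\alpha)x(t+\alpha)\,d\sigma\,d\alpha + S(t)\Big\} \tag{B5}$$

where $P_0(t)$, $P_1(t,\sigma)$ and $P_2(t,\sigma,\alpha)$ are matrices to be determined; $S(t)$ is
a scalar. Now, the expression of $\Delta\,dV/dt$ for a sample function and some
deterministic $\Delta x(t)$ can be written

$$\Delta\frac{dV}{dt} = \Big\{\Delta\frac{\partial V}{\partial t} + \Delta x'(t)P_0(t)x(t) + x'(t)P_0(t)\Delta x(t) + \Delta x'(t)\int_{-\tau}^{0} P_1(t,\sigma)x(t+\sigma)\,d\sigma + x'(t)\int_{-\tau}^{0} P_1(t,\sigma)\Delta x(t+\sigma)\,d\sigma + \int_{-\tau}^{0} \Delta x'(t+\sigma)P_1'(t,\sigma)\,d\sigma\,x(t) + \int_{-\tau}^{0} x'(t+\sigma)P_1'(t,\sigma)\,d\sigma\,\Delta x(t) + \int_{-\tau}^{0}\!\!\int_{-\tau}^{0} \big[\Delta x'(t+\sigma)P_2(t,\sigma,\alpha)x(t+\alpha) + x'(t+\sigma)P_2(t,\sigma,\alpha)\Delta x(t+\alpha)\big]\,d\sigma\,d\alpha + \dot{S}(t)\Delta\Big\} \tag{B6}$$

(In this appendix, $\Delta x(t)$ is written for $dx(t)$.) Similarly, $d^2V/dt^2$ can be obtained.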

It is now observed that $\Delta x(t)$ is actually a random variable. In order
to determine the expression in the braces of equation (B4), both $\Delta\,dV/dt$
and $\Delta^2\,d^2V/dt^2$ must be averaged (ensemble) over all possible $\Delta x(t+\xi)$,
$-\tau \le \xi \le 0$. Now, $w(t)$ and $\Delta x(t)$ are Gaussian processes. Moreover, $\Delta x(t)$ evolves
in time according to equation (1):

$$\Delta x(t) = [A_1 x(t) + A_2 x(t-\tau) + Bu(t)]\Delta + D(t)\,dw(t) \tag{B7}$$

The average value of $\Delta x(t)$ for a given sample function $x(t)$ evolves in time
according to

$$\overline{\Delta x}(t) = [A_1 x(t) + A_2 x(t-\tau) + Bu(t)]\Delta \tag{B8}$$

since $\overline{\Delta x}(t) = E\{\Delta x(t) \mid x(t)\}$, where the overbar signifies the ensemble average with
respect to $\Delta x$.

It follows that the probability density functions $p(\cdot)$ of $dw$ and $\Delta x$ are

$$p(dw(t)) = \text{const} \cdot \exp\big[-\tfrac{1}{2}\,dw'(Q_1\Delta)^{-1}dw\big] \tag{B9}$$

$$p[\Delta x(t+\sigma) \mid x(t), x(t+\sigma), t,\ -\tau \le \sigma \le 0] = \text{const} \cdot \exp\Big\{-\tfrac{1}{2}\big[\Delta x(t+\sigma) - \overline{\Delta x}(t+\sigma)\big]'\frac{N^{-1}}{\Delta}\big[\Delta x(t+\sigma) - \overline{\Delta x}(t+\sigma)\big]\Big\} \tag{B10}$$

where $N = D(t+\sigma)Q_1D'(t+\sigma)$.

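Since the probability density functions of $dw$ and $\Delta x$ are known, the
ensemble average (with respect to $\Delta x$) of the expression in equation (B4)
can be evaluated by means of equations (B5) through (B10):

$$E_{z,t}\Big\{L(t,x,u)\Delta + \Delta\frac{\partial V[x_t,t]}{\partial t} + [A_1 x(t) + A_2 x(t-\tau) + Bu(t)]'\Delta\,\Big[P_0(t)x(t) + \int_{-\tau}^{0} P_1(t,\sigma)x(t+\sigma)\,d\sigma\Big] + \Big[x'(t)P_0(t) + \int_{-\tau}^{0} x'(t+\sigma)P_1'(t,\sigma)\,d\sigma\Big][A_1 x(t) + A_2 x(t-\tau) + Bu(t)]\Delta + \int_{-\tau}^{0} \big[x'(t)P_1(t,\sigma)\overline{\Delta x}(t+\sigma) + \overline{\Delta x}'(t+\sigma)P_1'(t,\sigma)x(t)\big]\,d\sigma + \int_{-\tau}^{0}\!\!\int_{-\tau}^{0} \big[\overline{\Delta x}'(t+\sigma)P_2(t,\sigma,\alpha)x(t+\alpha) + x'(t+\sigma)P_2(t,\sigma,\alpha)\overline{\Delta x}(t+\alpha)\big]\,d\sigma\,d\alpha + \Delta\,\mathrm{tr}\Big[P_0(t)DQ_1D'(t) + \tfrac{1}{2}P_1(t,0)DQ_1D'(t) + \tfrac{1}{2}P_1'(t,0)DQ_1D'(t) + \int_{-\tau}^{0} P_2(t,\sigma,\sigma)D(t+\sigma)Q_1D'(t+\sigma)\,d\sigma\Big] + \Delta\dot{S}(t) + o(\Delta^2)\Big\} \tag{B11}$$

Expression (B11) can now be substituted into equation (B4). Dividing
through by $\Delta$ in the resulting equation and letting $\Delta$ approach zero, one
obtains: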
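$$0 = \min_{u} E_{z,t}\Big\{L(t,x,u) + \frac{\partial V}{\partial t} + [A_1 x(t) + A_2 x(t-\tau) + Bu(t)]'\Big[P_0(t)x(t) + \int_{-\tau}^{0} P_1(t,\sigma)x(t+\sigma)\,d\sigma\Big] + \Big[x'(t)P_0(t) + \int_{-\tau}^{0} x'(t+\sigma)P_1'(t,\sigma)\,d\sigma\Big][A_1 x(t) + A_2 x(t-\tau) + Bu(t)] + \lim_{\Delta \to 0}\int_{-\tau}^{0} \Big[x'(t)P_1(t,\sigma)\frac{\overline{\Delta x}(t+\sigma)}{\Delta} + \frac{\overline{\Delta x}'(t+\sigma)}{\Delta}P_1'(t,\sigma)x(t)\Big]\,d\sigma + \lim_{\Delta \to 0}\int_{-\tau}^{0}\!\!\int_{-\tau}^{0} \Big[\frac{\overline{\Delta x}'(t+\sigma)}{\Delta}P_2(t,\sigma,\alpha)x(t+\alpha) + x'(t+\sigma)P_2(t,\sigma,\alpha)\frac{\overline{\Delta x}(t+\alpha)}{\Delta}\Big]\,d\sigma\,d\alpha + \dot{S}(t) + \tfrac{1}{2}\mathrm{tr}\Big[\big(2P_0(t) + P_1(t,0) + P_1'(t,0)\big)DQ_1D'(t) + 2\int_{-\tau}^{0} P_2(t,\sigma,\sigma)DQ_1D'(t+\sigma)\,d\sigma\Big]\Big\} \tag{B12}$$

It is noted that in expression (B12), the term $\overline{\Delta x}(\cdot)$ is given by equation
(B8). Since $E[\Delta x \mid z(t)] = E\{E[\Delta x \mid z(t), x(t), t_0 \le t]\}$, it follows that
$\lim_{\Delta \to 0} E[\Delta x \mid z(t), x(t), t_0 \le t]/\Delta = dx(\cdot)/dt$ exists for all sample functions
of the $x(\cdot)$ process.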

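Since $\partial x(t+\sigma)/\partial t = \partial x(t+\sigma)/\partial \sigma$, an integration by parts in equation
(B12) leads to

$$0 = \min_{u} E_{z,t}\Big\{x'(t)Wx(t) + u'(t)Ru(t) + \frac{\partial V}{\partial t} + [A_1 x(t) + A_2 x(t-\tau) + Bu(t)]'\Big[P_0(t)x(t) + \int_{-\tau}^{0} P_1(t,\sigma)x(t+\sigma)\,d\sigma\Big] + \Big[x'(t)P_0(t) + \int_{-\tau}^{0} x'(t+\sigma)P_1'(t,\sigma)\,d\sigma\Big][A_1 x(t) + A_2 x(t-\tau) + Bu(t)] - \int_{-\tau}^{0} \Big[x'(t)\frac{\partial P_1(t,\sigma)}{\partial \sigma}x(t+\sigma) + x'(t+\sigma)\frac{\partial P_1'(t,\sigma)}{\partial \sigma}x(t)\Big]\,d\sigma - \int_{-\tau}^{0}\!\!\int_{-\tau}^{0} \Big[x'(t+\sigma)\frac{\partial P_2(t,\sigma,\alpha)}{\partial \sigma}x(t+\alpha) + x'(t+\sigma)\frac{\partial P_2(t,\sigma,\alpha)}{\partial \alpha}x(t+\alpha)\Big]\,d\sigma\,d\alpha + x'(t)P_1(t,0)x(t) + x'(t)P_1'(t,0)x(t) - x'(t)P_1(t,-\tau)x(t-\tau) - x'(t-\tau)P_1'(t,-\tau)x(t) + \int_{-\tau}^{0} \big[x'(t)P_2(t,0,\alpha)x(t+\alpha) - x'(t-\tau)P_2(t,-\tau,\alpha)x(t+\alpha)\big]\,d\alpha + \int_{-\tau}^{0} \big[x'(t+\sigma)P_2(t,\sigma,0)x(t) - x'(t+\sigma)P_2(t,\sigma,-\tau)x(t-\tau)\big]\,d\sigma + \dot{S}(t) + \tfrac{1}{2}\mathrm{tr}\Big[\big(2P_0(t) + P_1(t,0) + P_1'(t,0)\big)DQ_1D'(t) + 2\int_{-\tau}^{0} P_2(t,\sigma,\sigma)DQ_1D'(t+\sigma)\,d\sigma\Big]\Big\} \tag{B13}$$

In equation (B13), the terms $\dot{S}(t)$ and $\mathrm{tr}[\cdot]$ do not depend upon the
control. Hence, the optimal control can be obtained from equation (B13)
by expressing the control $u$ as a complete square. When the optimal control
$u^*$ is then substituted back into equation (B13), equations for determining
$P_0(t)$, $P_1(t,\sigma)$ and $P_2(t,\sigma,\alpha)$ are obtained.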
ON CERTAINTY-EQUIVALENCE AND CERTAINTY-DIFFERENCE
IN STOCHASTIC LINEAR SYSTEMS WITH TIME DELAY

A.  J.  Koivo 
Purdue  University 
Lafayette,  Indiana 

The optimization of stochastic linear systems with time delay is
presented in the framework of the dynamic programming method. A sufficient
condition for the optimality is obtained by applying the principle of
optimality. An example is presented to demonstrate that the certainty-equivalence
is valid in optimizing a class of stochastic linear systems
with time delay. Its validity, however, is quite restricted. Another
example, in which the variance of the additive noise in the differential-difference
equation depends upon the control, illustrates that a certainty-difference
may be encountered as well.

Introduction 

It is feasible that the optimal solution of a stochastic dynamical
system agrees with that of the deterministic dynamical system obtained by
formally replacing the random variables in the stochastic system by their
expected values. Such a coincidence is usually called certainty equivalence
[1,2]. It is known to hold, for example, if (i) the plant dynamics is
described by a stochastic linear (ordinary) differential equation containing
additive Gaussian noise; (ii) the covariance of the plant noise is
independent of the control and state variables; and (iii) the expected
value of the performance criterion to be minimized is a quadratic functional
in control and states. However, the certainty equivalence is not valid
in stochastic linear (ordinary) differential equations if the covariance
of the plant noise depends upon the control. In fact, in such a case,
a certainty-difference is encountered [1].

The purpose here is first to present the optimization of stochastic
linear systems with time delay in the framework of the dynamic programming
method. Then, it is demonstrated that the "principle of certainty-equivalence"
as well as the "principle of certainty-difference" are
encountered also in optimizing linear stochastic systems with time delay.

Problem Statement

The state transition of a system is governed by a stochastic scalar
(for convenience) differential-difference equation

$$dx(t) = [a_1(t)x(t) + a_2(t)x(t-\tau) + b(t)u(t)]\,dt + a_3(t)\,d\xi(t) \tag{1}$$

where $t \in [t_0, T]$, $T < \infty$; $x(t)$ represents the system state at time $t$ and
$x(t-\tau)$ at time $t-\tau$ (time-lag $\tau$ = const); $a_1(t)$, $a_2(t)$, $a_3(t)$ and $b(t)$ are
bounded $C^1$-functions of $t$; $\xi(t)$ is a Brownian motion process.
The observation equation is

$$dz(t) = [c_1(t)x(t) + c_2(t)x(t-\tau)]\,dt + d\eta(t) \tag{2}$$

where $c_1(t)$ and $c_2(t)$ are bounded and continuous in $t$ and $\eta(t)$ is a
Brownian motion process.



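The initial function for equation (1) is determined by a Gaussian
process specified by

$$E[x(t_0+\sigma)] = \hat{x}(t_0+\sigma), \quad -\tau \le \sigma,\ \theta \le 0 \tag{3}$$

$$E\{[x(t_0+\sigma) - \hat{x}(t_0+\sigma)]\,[x(t_0+\theta) - \hat{x}(t_0+\theta)]\} = C(t_0, \sigma, \theta) \tag{4}$$

The $\xi(t)$ and $\eta(t)$ processes are zero-mean processes with variances,
respectively,

$$E[\xi(t_1)\xi(t_2)] = Q_1\,\delta(t_1 - t_2) \tag{5}$$

$$E[\eta(t_1)\eta(t_2)] = Q_2\,\delta(t_1 - t_2) \tag{6}$$

where $Q_1(\cdot)$ and $Q_2(\cdot)$ are given.

It is also assumed that $x(t)$, $t \in [t_0-\tau, t_0]$, $\xi(t)$, and $\eta(t)$, $t \in [t_0, T]$,
represent independent random processes.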
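
As a rough illustration of the plant model in equation (1), the following Python sketch integrates the scalar differential-difference equation with an Euler-Maruyama step. Constant coefficients, the function name `simulate_delay_sde`, the open-loop control argument `u`, the step size, and the seed are assumptions introduced only for this example.

```python
import math
import random


def simulate_delay_sde(a1, a2, a3, b, tau, q1, u, x_init, t_end, dt, seed=0):
    """Euler-Maruyama sketch for dx = [a1 x(t) + a2 x(t - tau) + b u(t)] dt + a3 dxi.

    Assumptions (not from the paper): constant coefficients, an open-loop
    control u(t), and an initial function x_init(s) defined for -tau <= s <= 0.
    """
    rng = random.Random(seed)
    lag = int(round(tau / dt))   # delay expressed in grid steps
    n = int(round(t_end / dt))   # number of integration steps
    # history[k] holds x at time (k - lag) * dt, so history[lag] is x(0)
    history = [x_init((k - lag) * dt) for k in range(lag + 1)]
    for k in range(n):
        t = k * dt
        x_now = history[-1]
        x_delayed = history[-1 - lag]
        # Brownian increment with variance q1 * dt
        dxi = rng.gauss(0.0, math.sqrt(q1 * dt))
        drift = a1 * x_now + a2 * x_delayed + b * u(t)
        history.append(x_now + drift * dt + a3 * dxi)
    return history[lag:]         # samples of x on [0, t_end]


if __name__ == "__main__":
    path = simulate_delay_sde(-1.0, 0.5, 0.2, 1.0, 0.5, 1.0,
                              lambda t: 0.0, lambda s: 1.0, 2.0, 0.001)
    print(path[-1])
```

Setting `a3 = 0` removes the noise term, which gives a quick way to compare a sample path against the deterministic trajectory of the same delay equation.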

The problem is to determine the deterministic optimal control $u^*(t, \hat{x}_t)$
so as to minimize

$$J[u] = E_{z,t_0}\Big\{ \int_{t_0}^{T} [w(\sigma)x^2(\sigma) + r(\sigma)u^2(\sigma)]\,d\sigma \Big\} \tag{7}$$

where $T$ is given, $w(\sigma)$ and $r(\sigma)$ are positive for all $\sigma$, and
$E_{z,t}\{\cdot\} = E\{\cdot \mid z(s),\ t_0 \le s \le t\}$ signifies the conditional expectation.

The optimal control problem posed is solved by the application of
Bellman's principle of optimality. Sufficient equations for the optimality
are first presented. Then, these equations are applied to two examples.

The one demonstrates a case in which the "principle" of certainty-equivalence
is valid; the other illustrates certainty-difference in
stochastic systems with time delay. For the former case, it is shown that
the resulting optimal control is the same as the one obtained in the
deterministic problem by minimizing

$$\hat{J}[u] = \int_{t_0}^{T} [w(\sigma)\hat{x}^2(\sigma) + r(\sigma)u^2(\sigma)]\,d\sigma \tag{8}$$

subject to the constraint

$$\frac{d\hat{x}(t)}{dt} = a_1(t)\hat{x}(t) + a_2(t)\hat{x}(t-\tau) + b(t)u(t), \quad t \in [t_0, T] \tag{9}$$

where $\hat{x}(t) = E_{z,t}\{x(t)\}$. Equations (8) and (9) have been obtained by
substituting the mean values in equations (1) and (7) for the random
variables. For the latter case, it is shown that, if the variance $Q_1(\cdot)$
in equation (5) depends upon the control $u(\cdot)$, the optimal control that
minimizes (7) subject to the constraint equation (1) is not the same as
the optimal control that minimizes (8) subject to the constraint equation
(9).

Preliminary  Material 

Some introductory material on the linear differential-difference
equation is first presented [2,3]. When the state of the system is
governed by the differential-difference equation (1), the evolution of the
process $x(t)$, $t \ge t_0$, is defined by

$$\begin{aligned}
x(t) = {} & \Phi(t,t_0)x(t_0) + \int_{-\tau}^{0} \Phi(t,t_0+\tau+\alpha)\,a_2(t_0+\tau+\alpha)\,x(t_0+\alpha)\,d\alpha \\
& + \int_{t_0}^{t} \Phi(t,\alpha)b(\alpha)u(\alpha)\,d\alpha + \int_{t_0}^{t} \Phi(t,\alpha)a_3(\alpha)\,d\xi(\alpha)
\end{aligned} \tag{10}$$

where $\Phi(t,t_0)$, $t \ge t_0$, is the solution to the homogeneous part [$u(t) = 0$]
of equation (9) with the boundary condition $\Phi(t,t) = 1$ and $\Phi(t,\alpha) = 0$
if $t < \alpha$.
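
The fundamental solution $\Phi$ rarely has a closed form when a delay is present, so a numerical sketch may help. The Python function below approximates $\Phi(t,0)$ by forward Euler for the homogeneous equation; constant coefficients `a1` and `a2`, the function name, and the grid step are assumptions made for illustration only.

```python
import math


def fundamental_solution(a1, a2, tau, t_end, dt):
    """Approximate Phi(t, 0) on a uniform grid for
    dPhi/dt = a1 * Phi(t) + a2 * Phi(t - tau),
    with Phi(0, 0) = 1 and Phi = 0 before the initial time.

    Constant coefficients are an assumption made for this sketch.
    """
    lag = int(round(tau / dt))   # delay expressed in grid steps
    n = int(round(t_end / dt))   # number of Euler steps
    # Prehistory is zero, value at the initial time is one
    phi = [0.0] * lag + [1.0]
    for _ in range(n):
        phi.append(phi[-1] + dt * (a1 * phi[-1] + a2 * phi[-1 - lag]))
    return phi[lag:]             # Phi(k * dt, 0) for k = 0, ..., n


if __name__ == "__main__":
    grid = fundamental_solution(0.0, 1.0, 1.0, 2.0, 0.01)
    print(grid[100], grid[-1])
```

With `a2 = 0` the result approaches the ordinary exponential $e^{a_1 t}$, while with `a1 = 0`, `a2 = 1`, `tau = 1` the method of steps gives $\Phi = 1$ on $[0,1]$ and $\Phi = t$ on $[1,2]$, which the sketch reproduces.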

If $u(\cdot)$ is assumed to be linear in the state variable, then equation
(10) can be used to demonstrate that the functional in equation (7) is
of the form

$$\begin{aligned}
J[u] = {} & P_0(t_0)\hat{x}^2(t_0) + 2\hat{x}(t_0)\int_{-\tau}^{0} P_1(t_0,\sigma)\hat{x}(t_0+\sigma)\,d\sigma \\
& + \int_{-\tau}^{0}\!\int_{-\tau}^{0} \hat{x}(t_0+\sigma)P_2(t_0,\sigma,\alpha)\hat{x}(t_0+\alpha)\,d\sigma\,d\alpha + S(t_0)
\end{aligned} \tag{11}$$

where $P_0(t_0)$, $P_1(t_0,\sigma)$, $P_2(t_0,\sigma,\alpha)$ and $S(t_0)$ are independent of $x$. The
form of equation (11) will be used in the sequel to determine the optimal
solution. Also, the equations specifying $P_0(t)$, $P_1(t,\sigma)$, and $P_2(t,\sigma,\alpha)$
will then be obtained.

Solution by Means of Dynamic Programming

To determine the solution to the stochastic optimal control problem,
the principle of optimality is applied. When the system starts at time $t$
from state $x(t+\sigma)$, $-\tau \le \sigma \le 0$, the minimum value of the return function is
denoted by

$$V[x_t, t] = \min_u E_{z,t}\Big\{ \int_{t}^{T} [w(\sigma)x^2(\sigma) + r(\sigma)u^2(\sigma)]\,d\sigma \Big\} \tag{12}$$

By the principle of optimality, one obtains

$$V[x_t, t] = \min_u E_{z,t}\Big\{ \int_{t}^{t+\Delta} [w x^2(\sigma) + r u^2(\sigma)]\,d\sigma + V[x_{t+\Delta}, t+\Delta] \Big\} \tag{13}$$


where $\Delta$ represents a small time-increment. The expression in the braces
on the right of equation (13) is expanded about the function representing
the evolution of the mean value $\hat{x}$ of the x-process. The term $V[\hat{x}_t, t]$
appears now on both sides of the resulting equation. Since it is independent
of the control, it can be canceled. By assuming that the third partial
derivatives of $V[\cdot,\cdot]$ are bounded, and dividing through by $\Delta$, one obtains

$$0 = \min_u E_{z,t}\Big\{ w x^2(t) + r u^2(t) + \Big(\frac{dV}{dt}\Big)_{\hat{x}} + \frac{1}{2}\Big(\frac{d^2V}{dt^2}\Big)_{\hat{x}}\Delta + \frac{o(\Delta)}{\Delta} \Big\} \tag{14}$$

where $\lim o(\Delta)/\Delta = 0$ as $\Delta$ approaches zero. Equation (14) is the basic
equation to be used in solving the stochastic control problem.

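In the specific case where the plant is linear and the return function
quadratic in state variables and control, $V[x_t, t]$ is assumed to have the
form presented in equation (11), where $t_0$ is replaced by $t$ and $P_0(t)$,
$P_1(t,\sigma)$, and $P_2(t,\sigma,\alpha)$ are matrices to be determined. This expression of
$V[\cdot,\cdot]$ is now substituted into equation (14). Forming $dV/dt$ and $d^2V/dt^2$
about the average trajectory $\hat{x}(t)$, substituting $dx(t)$ from equation (1),
performing appropriate expectations in the resulting expressions relative
first to $dx$ (or $d\xi$) and then to $x$ given $z(s)$, $t_0 \le s \le t$, one obtains

$$\begin{aligned}
0 = \min_u \Big\{ & w\hat{x}^2(t) + wC(t,0,0) + ru^2(t) + \frac{\partial V}{\partial t} \\
& + 2[a_1\hat{x}(t) + a_2\hat{x}(t-\tau) + bu]\Big[P_0(t)\hat{x}(t) + \int_{-\tau}^{0} P_1(t,\sigma)\hat{x}(t+\sigma)\,d\sigma\Big] \\
& + 2\int_{-\tau}^{0} \hat{x}(t)P_1(t,\sigma)\frac{\partial\hat{x}(t+\sigma)}{\partial\sigma}\,d\sigma \\
& + \int_{-\tau}^{0}\!d\sigma\int_{-\tau}^{0}\!d\alpha\,\Big[\frac{\partial\hat{x}(t+\sigma)}{\partial\sigma}P_2(t,\sigma,\alpha)\hat{x}(t+\alpha) + \hat{x}(t+\sigma)P_2(t,\sigma,\alpha)\frac{\partial\hat{x}(t+\alpha)}{\partial\alpha}\Big] \\
& + \tfrac{1}{2}[P_0(t) + P_1(t,0)]\,a_3^2(t)Q_1 + \tfrac{1}{2}\int_{-\tau}^{0} P_2(t,\sigma,\sigma)\,a_3^2(t+\sigma)Q_1\,d\sigma \Big\}
\end{aligned} \tag{15}$$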
Hence, the optimal control is determined by minimizing the deterministic
expression in the braces of equation (15). The solutions to two different
optimal control problems are now given; the details can be found in
the appendix. In the first case, the variance $Q_1$ of the plant noise is
constant; in the second one, the same variance is dependent upon the control.


Certainty-Equivalence

Suppose that the variance $Q_1$ of the plant noise is constant.
Equation (15) determines now the optimal control:

$$u^*(t,\hat{x}_t) = -\frac{1}{r}\,b(t)\Big[P_0(t)\hat{x}(t) + \int_{-\tau}^{0} P_1(t,\sigma)\hat{x}(t+\sigma)\,d\sigma\Big] \tag{16}$$

Hence, the optimal control is linear in the estimated value $\hat{x}$ of the state
variable. Substituting equation (16) back into equation (15), equations
for determining $P_0(t)$, $P_1(t,\sigma)$, and $P_2(t,\sigma,\alpha)$ result:

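$$\frac{dP_0(t)}{dt} + 2a_1(t)P_0(t) + 2P_1(t,0) - \frac{b^2(t)}{r}P_0^2(t) + w(t) = 0 \tag{17}$$

$$\frac{\partial P_1(t,\sigma)}{\partial t} - \frac{\partial P_1(t,\sigma)}{\partial\sigma} + a_1(t)P_1(t,\sigma) - \frac{b^2(t)}{r}P_0(t)P_1(t,\sigma) + P_2(t,0,\sigma) = 0 \tag{18}$$

$$\frac{\partial P_2(t,\sigma,\alpha)}{\partial t} - \frac{\partial P_2(t,\sigma,\alpha)}{\partial\sigma} - \frac{\partial P_2(t,\sigma,\alpha)}{\partial\alpha} - \frac{b^2(t)}{r}P_1(t,\sigma)P_1(t,\alpha) = 0 \tag{19}$$

$$\frac{dS(t)}{dt} + w(t)C(t,0,0) + \tfrac{1}{2}[P_0(t) + P_1(t,0)]\,a_3^2(t)Q_1 + \tfrac{1}{2}\int_{-\tau}^{0} P_2(t,\sigma,\sigma)\,a_3^2(t+\sigma)Q_1\,d\sigma = 0 \tag{20}$$

with the boundary conditions

$$\begin{aligned}
& P_0(T) = 0; \quad P_1(T,\sigma) = 0; \quad P_2(T,\sigma,\alpha) = 0; \quad S(T) = 0; \quad -\tau \le \sigma,\ \alpha \le 0 \\
& a_2(t)P_0(t) - P_1(t,-\tau) = 0 \\
& a_2(t)P_1(t,\sigma) - P_2(t,-\tau,\sigma) = 0; \quad -\tau \le \sigma \le 0
\end{aligned} \tag{21}$$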
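
A quick consistency check on the gain equations is the delay-free limit. The short derivation below is a sketch that assumes the forms of equations (16), (17) and (21) read as stated in this section; it sets $a_2(t) \equiv 0$ and verifies that the familiar scalar Riccati equation is recovered.

```latex
% Delay-free special case: a_2(t) = 0 (an assumption made for this check).
% The boundary conditions (21) give P_1(t,-\tau) = 0 and P_2(t,-\tau,\sigma) = 0,
% and together with the zero terminal values the choice
%   P_1 \equiv 0, \qquad P_2 \equiv 0
% satisfies equations (18) and (19) identically.
\begin{align*}
\frac{dP_0(t)}{dt} + 2a_1(t)P_0(t) - \frac{b^2(t)}{r}P_0^2(t) + w(t) &= 0,
  \qquad P_0(T) = 0, \\
u^*(t,\hat{x}_t) &= -\frac{b(t)}{r}\,P_0(t)\,\hat{x}(t).
\end{align*}
```

In this limit the control depends only on the current estimate $\hat{x}(t)$, which is the usual linear-quadratic result for a scalar plant without delay.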

It is emphasized that the realization of the optimal control requires
the determination of the estimates $\hat{x}(t+\sigma) = E_{z,t}\{x(t+\sigma)\}$, $-\tau \le \sigma \le 0$, of
the state variables. The expected values of the state variables conditioned
on the measurements furnish the minimum for the average value of the integrated
estimation error squared. They can be obtained by the application of the
results on the optimal filtering in linear stochastic systems with time
delay [6]. The feedback gains in equation (16) can be computed "off-line"
before the application of the optimal control. Moreover, they are not
functions of the observations $z(t)$ or of the state variables $x(t)$.
Consequently, the determination of the optimal control is independent of
the optimal estimation. This statement is known as a "separation theorem"
in conjunction with stochastic optimum control problems.

If the optimum control problem described by system (8) and (9), obtained
by substituting only mean values for the random variables, is solved, the
resulting optimal control is exactly the same as the one given in equation
(16). In fact, the feedback gains are also specified by the same equations
((17) through (21)). This coincidence is usually termed the "principle"
of certainty equivalence, which is here established for a class of stochastic
systems with time delay.


Certainty-Difference

Suppose the variance $Q_1$ of the plant noise is dependent upon the control,
say, $Q_1 = 2q_1u^2$, where $q_1$ is a constant. This situation occurs, for example,
if the randomness in the plant is produced by the application of the control.
Hence, the random variable in the plant equation is zero if the control is zero.
The optimal control is again determined by equation (15). If $Q_1 = 2q_1u^2$ is
substituted into equation (15), the minimization operation yields for the
optimal control

$$\begin{aligned}
u^*(t,\hat{x}_t) &= -b(t)\,\frac{P_0(t)\hat{x}(t) + \int_{-\tau}^{0} P_1(t,\sigma)\hat{x}(t+\sigma)\,d\sigma}{r + [P_0(t) + P_1(t,0)]\,a_3^2(t)q_1}; \\
u^*(t+\sigma,\hat{x}_t) &= 0, \quad -\tau \le \sigma < 0
\end{aligned} \tag{22}$$

where $\hat{x}(t+\sigma) = E_{z,t}\{x(t+\sigma)\}$. It is noted that $r$ is positive definite; $P_0(t)$
and $P_1(t,0)$ are nonnegative for $t \in [t_0, T]$; and $P_2(t,\sigma,\sigma)$ is nonnegative [5]
in the domain of its definition. Substituting equation (22) back
into equation (15), equations for determining $P_0(t)$, $P_1(t,\sigma)$ and $P_2(t,\sigma,\alpha)$ result:
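$$\frac{dP_0(t)}{dt} + 2a_1(t)P_0(t) + 2P_1(t,0) + w(t) - \frac{b^2(t)P_0^2(t)}{r_a(t)}\Big[2 - \frac{r}{r_a(t)}\Big] = 0 \tag{23}$$

where $r_a(t) = r + [P_0(t) + P_1(t,0)]\,a_3^2(t)q_1$;

$$\frac{\partial P_1(t,\sigma)}{\partial t} - \frac{\partial P_1(t,\sigma)}{\partial\sigma} + a_1(t)P_1(t,\sigma) + P_2(t,0,\sigma) - \frac{b^2(t)P_0(t)P_1(t,\sigma)}{r_a(t)}\Big[2 - \frac{r}{r_a(t)}\Big] = 0 \tag{24}$$

$$\frac{\partial P_2(t,\sigma,\alpha)}{\partial t} - \frac{\partial P_2(t,\sigma,\alpha)}{\partial\sigma} - \frac{\partial P_2(t,\sigma,\alpha)}{\partial\alpha} - \frac{b^2(t)P_1(t,\alpha)P_1(t,\sigma)}{r_a(t)}\Big[2 - \frac{r}{r_a(t)}\Big] = 0 \tag{25}$$

$$\frac{dS(t)}{dt} + \cdots + w(t)C(t,0,0) = 0 \tag{26}$$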


The boundary conditions are the same as those given in equation (21).

Suppose now that the optimal control is determined on the basis of the
certainty-equivalence. The resulting control is given by equation (16). It
does not agree with the optimal control specified by equation (22). Indeed,
the use of the certainty-equivalence leads to an incorrect answer. The
example presented here demonstrates a certainty-difference in stochastic
linear systems with time delay.

Conclusions 

A sufficient condition for the optimization of stochastic linear systems
with time delay is formally derived by means of the dynamic programming
method. The result is then applied to a system in which the plant, described
by differential-difference equations, is linear and contains additive Gaussian
noise. It is demonstrated that the "principle" of certainty-equivalence holds
when the variance of the plant noise is constant. However, the use of the
certainty-equivalence leads to an incorrect answer when the variance of the
plant noise depends upon the control. In fact, a certainty-difference is
encountered in such an example.
Appendix

Formal Derivation of Equations (15), (16), (22)

By the application of the principle of optimality, equation (12) can
be expressed in the form given by equations (13) and (14). The derivatives
$dV/dt$ and $d^2V/dt^2$ in equation (14) are evaluated about the trajectory $\hat{x}(t)$.

It is stipulated that

$$\begin{aligned}
V[x_t, t] = {} & P_0(t)x^2(t) + 2x(t)\int_{-\tau}^{0} P_1(t,\sigma)x(t+\sigma)\,d\sigma \\
& + \int_{-\tau}^{0}\!d\sigma\int_{-\tau}^{0}\!d\alpha\,[x(t+\sigma)P_2(t,\sigma,\alpha)x(t+\alpha)] + S(t)
\end{aligned} \tag{A1}$$

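It follows that

$$\begin{aligned}
(dV)_{\hat{x}(t)} = {} & \Big(\frac{\partial V}{\partial t}\Big)_{\hat{x}(t)}\Delta + 2(dx(t))\Big[P_0(t)\hat{x}(t) + \int_{-\tau}^{0} P_1(t,\sigma)\hat{x}(t+\sigma)\,d\sigma\Big] \\
& + 2\hat{x}(t)\int_{-\tau}^{0} P_1(t,\sigma)\,(dx(t+\sigma))\,d\sigma \\
& + \int_{-\tau}^{0}\!d\sigma\int_{-\tau}^{0}\!d\alpha\,\big[(dx(t+\sigma))P_2(t,\sigma,\alpha)\hat{x}(t+\alpha) + \hat{x}(t+\sigma)P_2(t,\sigma,\alpha)(dx(t+\alpha))\big]
\end{aligned} \tag{A2}$$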

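where $dx(t)$ is considered as determinate. Similarly, $d^2V/dt^2$ can be written.
In expression (A2), $dx(t)$ is actually a random variable, which evolves in
time according to equation (1). The expression (A2) is then substituted
for $(dV/dt)_{\hat{x}}$ in equation (14). The term $(d^2V/dt^2)_{\hat{x}}$ in equation (14)
is replaced in the same manner. The resulting expression can be written
as follows:

$$\begin{aligned}
0 = \min_u E_{z,t}\Big\{ & wx^2 + ru^2 + \Big(\frac{\partial V}{\partial t}\Big)_{\hat{x}(t)} + 2(dx(t))\Big[P_0(t)\hat{x}(t) + \int_{-\tau}^{0} P_1(t,\sigma)\hat{x}(t+\sigma)\,d\sigma\Big]\Big/\Delta \\
& + 2\hat{x}(t)\int_{-\tau}^{0} P_1(t,\sigma)\,(dx(t+\sigma))\,d\sigma\Big/\Delta \\
& + \int_{-\tau}^{0}\!d\sigma\int_{-\tau}^{0}\!d\alpha\,\big[(dx(t+\sigma))P_2(t,\sigma,\alpha)\hat{x}(t+\alpha) + \hat{x}(t+\sigma)P_2(t,\sigma,\alpha)(dx(t+\alpha))\big]\Big/\Delta \\
& + \frac{1}{2}\Big(\frac{d^2V}{dt^2}\Big)_{\hat{x}(t)}\Delta + \frac{o(\Delta)}{\Delta} \Big\}
\end{aligned} \tag{A3}$$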

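In order to perform the expectation operation, the probability density
functions associated with $d\xi$ and $dx$ are written:

$$p[d\xi(t)] = \text{const.}\,\exp\big[-\tfrac{1}{2}(d\xi)^2/(Q_1\Delta)\big] \tag{A4}$$

$$p[dx(t+\sigma) \mid x(t), z(t), t,\ -\tau \le \sigma \le 0] = \text{const.}\,\exp\big\{-\tfrac{1}{2}[dx(t+\sigma) - \overline{dx}(t+\sigma)]^2/[\Delta\,a_3^2(t+\sigma)Q_1]\big\} \tag{A5}$$

where $\overline{dx}(t)$ is governed by the equation

$$\overline{dx}(t) = [a_1x(t) + a_2x(t-\tau) + bu(t)]\,\Delta \tag{A6}$$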
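
The two moments that matter in the averaging step can be read off directly from the Gaussian density (A5). The lines below are a brief worked sketch, writing $\overline{dx}$ for the conditional mean defined in (A6); the overbar notation is a choice made here for readability.

```latex
% First and second conditional moments implied by the Gaussian density (A5).
\begin{align*}
E\big[dx(t+\sigma) \mid x\big] &= \overline{dx}(t+\sigma), \\
E\big[(dx(t+\sigma) - \overline{dx}(t+\sigma))^2 \mid x\big] &= a_3^2(t+\sigma)\,Q_1\,\Delta .
\end{align*}
% The variance is of first order in \Delta, so the second-derivative term
% \tfrac{1}{2}(d^2V/dt^2)\Delta in (14) contributes terms proportional to
% a_3^2 Q_1 that survive the limit \Delta \to 0.
```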


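Now, $dx(t)$ from equation (1) is substituted into equation (A3). Since

$$E[\Delta x \mid z(s),\ t_0 \le s \le t] = E_{z,t}\,E[\Delta x \mid x(t), z(s),\ t_0 \le s \le t] = E_{z,t}\,E[\Delta x \mid x(s),\ s \le t],$$

one can perform the averaging in equation (A3) (after the substitution of
$dx$) and the limiting process as $\Delta$ approaches zero:

$$\begin{aligned}
0 = \min_u \Big\{ & w\hat{x}^2(t) + wC(t,0,0) + ru^2(t) + \Big(\frac{\partial V}{\partial t}\Big) + 2\hat{x}(t)\int_{-\tau}^{0} P_1(t,\sigma)\frac{\partial\hat{x}(t+\sigma)}{\partial\sigma}\,d\sigma \\
& + 2[a_1\hat{x}(t) + a_2\hat{x}(t-\tau) + bu(t)]\Big[P_0\hat{x}(t) + \int_{-\tau}^{0} P_1(t,\sigma)\hat{x}(t+\sigma)\,d\sigma\Big] \\
& + \int_{-\tau}^{0}\!d\sigma\int_{-\tau}^{0}\!d\alpha\,\Big[\frac{\partial\hat{x}(t+\sigma)}{\partial\sigma}P_2(t,\sigma,\alpha)\hat{x}(t+\alpha) + \hat{x}(t+\sigma)P_2(t,\sigma,\alpha)\frac{\partial\hat{x}(t+\alpha)}{\partial\alpha}\Big] \\
& + \tfrac{1}{2}[P_0(t) + P_1(t,0)]\,a_3^2(t)Q_1 + \tfrac{1}{2}\int_{-\tau}^{0} P_2(t,\sigma,\sigma)\,a_3^2(t+\sigma)Q_1\,d\sigma \Big\}
\end{aligned} \tag{A7}$$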


where use has been made of the fact that
$\lim_{\Delta\to 0} E[\Delta x(t)/\Delta \mid z(s),\ t_0 \le s \le t] = \lim_{\Delta\to 0} \Delta\hat{x}(t)/\Delta = d\hat{x}(t)/dt$,
and that $\partial\hat{x}(t+\sigma)/\partial t = \partial\hat{x}(t+\sigma)/\partial\sigma$.

Suppose now that $Q_1$ is constant. Since $C(t,0,0)$ and $Q_1$ do not depend
upon the control explicitly, and the terms $\partial\hat{x}(t+\sigma)/\partial\sigma$ and $\partial\hat{x}(t+\alpha)/\partial\alpha$ can
be integrated by parts and do not contain the control $u$ explicitly, the
minimum in expression (A7) is attained for the control specified by equation
(16).
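
The minimization step for constant $Q_1$ can be written out explicitly. The sketch below keeps only the terms of (A7) that contain $u(t)$ and uses the shorthand $G(t)$, introduced here for convenience, for the bracketed linear functional of the estimates.

```latex
% u-dependent part of (A7) for constant Q_1, with the shorthand
%   G(t) = P_0(t)\hat{x}(t) + \int_{-\tau}^{0} P_1(t,\sigma)\hat{x}(t+\sigma)\,d\sigma .
\begin{align*}
f(u) &= r\,u^2(t) + 2\,b(t)\,u(t)\,G(t), \\
\frac{\partial f}{\partial u} &= 2\,r\,u(t) + 2\,b(t)\,G(t) = 0
  \;\;\Longrightarrow\;\; u^*(t,\hat{x}_t) = -\frac{b(t)}{r}\,G(t), \\
\frac{\partial^2 f}{\partial u^2} &= 2r > 0 .
\end{align*}
```

Since $r$ is positive, the stationary point is a minimum, and it coincides with the control in equation (16).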

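Suppose now that $Q_1 = 2u^2q_1$. Collecting now the terms in equation
(A7) which contain the control $u$ explicitly, one obtains the expression
to be minimized:

$$\begin{aligned}
& ru^2(t) + 2bu(t)\Big[P_0\hat{x}(t) + \int_{-\tau}^{0} P_1(t,\sigma)\hat{x}(t+\sigma)\,d\sigma\Big] + [P_0 + P_1(t,0)]\,a_3^2(t)q_1u^2(t) \\
& + \int_{-\tau}^{0} P_2(t,\sigma,\sigma)\,a_3^2(t+\sigma)q_1u^2(t+\sigma)\,d\sigma
\end{aligned} \tag{A8}$$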


The minimum of the expression (A8) is achieved by choosing the optimal
control $u^*(t,\hat{x}_t)$ and $u^*(t+\sigma,\hat{x}_t)$, $-\tau \le \sigma < 0$, as specified by the expressions in
equation (22). By substituting the expression of the optimal control
back into equation (15) and collecting terms, the resulting equation yields
the equations governing $P_0(t)$, $P_1(t,\sigma)$ and $P_2(t,\sigma,\alpha)$.
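
The difference between the two controls can be checked numerically. The Python sketch below evaluates the part of (A8) that depends on the current control $u(t)$, minimizes it with a simple ternary search, and compares the result with the closed forms of equations (16) and (22). The function names, the scalar `g` standing for the bracketed term $P_0\hat{x}(t) + \int P_1\hat{x}\,d\sigma$, and the numerical values are all assumptions chosen for illustration.

```python
def control_cost(u, r, b, g, p0, p1_0, a3, q1):
    """Part of (A8) that depends on the current control u(t).

    g stands for P0*xhat(t) + integral of P1*xhat (a hypothetical scalar input).
    """
    return r * u * u + 2.0 * b * u * g + (p0 + p1_0) * a3 * a3 * q1 * u * u


def certainty_difference_control(r, b, g, p0, p1_0, a3, q1):
    """Closed-form minimizer in the form of equation (22)."""
    return -b * g / (r + (p0 + p1_0) * a3 * a3 * q1)


def certainty_equivalence_control(r, b, g):
    """Control in the form of equation (16), which ignores the noise term."""
    return -b * g / r


def minimize_scalar(f, lo, hi, iters=200):
    """Ternary search for the minimum of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    params = dict(r=1.0, b=2.0, g=0.5, p0=1.0, p1_0=0.5, a3=1.0, q1=2.0)
    u_num = minimize_scalar(lambda u: control_cost(u, **params), -5.0, 5.0)
    print(u_num)
    print(certainty_difference_control(**params))
    print(certainty_equivalence_control(params["r"], params["b"], params["g"]))
```

For these illustrative values the numerical minimizer agrees with the form of equation (22), while the certainty-equivalence control of equation (16) is larger in magnitude because it does not account for the control-dependent noise.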